Technical Analysis Project

Microstructures and trading systems


Rubén Hernández Guevara, if717710@iteso.mx

José Tonatiuh Navarro Silva, if722399@iteso.mx

Héctor Oñate Rivera, if720431@iteso.mx

07/2022 | Repository: Link


Technical Analysis

Exponential Moving Average, Average True Range, Stochastic Relative Strength Index


Abstract

This project aims to take technical analysis beyond its traditional, discretionary use by applying programming, backtesting, simulation and optimization, while keeping the fundamentals of TA, i.e., relying on established technical analysis tools. We work with three of them:

  • Exponential Moving Average
  • Stochastic Relative Strength Index
  • Average True Range

The hypothesis is that combining these indicators, and then backtesting and optimizing their parameters, yields a profit at the end date that is higher than that of a traditional, empirical trading strategy.

1. Introduction


Trading consists of negotiating assets in the market with the objective of obtaining profits. Several approaches have been created and implemented to maximize those profits, such as technical analysis and fundamental analysis. This project aims to take technical analysis beyond its traditional, discretionary use by applying programming, backtesting, simulation and optimization, while keeping the fundamentals of TA, i.e., relying on established technical analysis tools. We work with three of them:

  • Exponential Moving Average (EMA)
  • Stochastic Relative Strength Index (SRSI)
  • Average True Range (ATR)

The strategy is simple and, because we work from a market microstructure perspective, it uses time intervals shorter than one hour.

A buy signal is given when two conditions are met:

  • The SRSI 'K' parameter is above the 'D' parameter for the current time and the previous time, i.e.: $K_i > D_i \ \& \ K_{i-1} > D_{i-1}$.

  • Three Exponential Moving Averages ($ema_1, ema_2, ema_3$) of three different lengths, where $len(ema_1) < len(ema_2) \ \ \& \ \ len(ema_2) < len(ema_3)$, are ordered so that $ema_1 > ema_2 \ \ \& \ \ ema_2 > ema_3$.

These parameters are optimized over a training period, from 01/01/2018 to 01/01/2019, which yields the best parameter set. A test dataset is then simulated with those parameters; the test period runs from 01/02/2019 to 01/02/2020. Another parameter for the optimization is a function that weights the Sharpe ratio and ...

The hypothesis is that combining these indicators, and then backtesting and optimizing their parameters, yields a profit at the end date that is higher than that of a traditional, empirical trading strategy.

2. Install/Load Packages and Dependencies


2.1 Python Packages

To run this notebook, the following packages must be installed; they are listed in the requirements.txt file:

  • python-binance >= 1.0.16
  • backtesting >= 0.3.3
  • ta >= 0.10.1
  • nbformat >= 5.4.0
  • numpy >= 1.23.0
  • pandas >= 1.4.3
  • plotly >= 5.9.0

2.2 Files Dependencies

The following are the file dependencies that are needed to run this notebook:

  • files/BTCUSDT[1h, 3m, 15m, 30m, 1h].csv: BTC-USDT data downloaded from Binance through its API.
  • data.py: contains the ETL process that gets the data from Binance.
  • main.py: contains all the necessary operations and processes.
  • functions.py: contains the main functions used throughout the project.
  • visualizations.py: contains the visualization functions, built with plotly.

2.3 Install Packages

In [ ]:
%%capture

# Install all the pip packages in the requirements.txt
import sys
!{sys.executable} -m pip install -r requirements.txt

2.4 Load Packages and Scripts

In [ ]:
import data as dt
import main as main
import functions as fn
import visualizations as vz
from IPython.display import display, Image

import plotly.io as pio
pio.renderers.default='notebook'

3. Data and Validations


The main dataset is a BTCUSDT.csv file containing Bitcoin-Tether data from 01-01-2018 to 02-01-2020. It has the following structure:

  • Open: float: Open price
  • High: float: highest price
  • Low: float: lowest price
  • Close: float: close price
  • Volume: float: volume
In [ ]:
# the first 3 rows:
display(dt.BTCUSDT.head(3))
Open High Low Close Volume K D EMA_8 EMA_14 EMA_40 ATR KD_Cross TP SL Buy_signal Sell_signal Outcome
Open Time
2018-01-01 09:45:00 13599.99 13670.00 13571.33 13616.99 51.550990 0.396432 0.353634 13587.796139 13589.883255 13529.312000 136.981575 False 13760.820654 13480.8201 0 0 NaN
2018-01-01 10:00:00 13632.00 13657.92 13540.33 13550.00 50.432284 0.392360 0.387438 13579.396997 13584.565488 13530.321171 135.521109 False 13692.297165 13414.5000 0 0 NaN
2018-01-01 10:15:00 13550.00 13560.98 13497.98 13549.03 47.520692 0.348364 0.379052 13572.648776 13579.827423 13531.233797 130.080363 False 13685.614381 13413.5397 0 0 NaN

1- Initial Capital: $100,000 \ \text{USD}$.

2- Maximum risk per trade: $1,000 \ \text{USD}$. We cover this by committing only a fraction of the capital to each trade.

3- Divide the data into train and test.

train: Jan/01/2018 - Jan/01/2019

test: Feb/01/2019 - Feb/01/2020
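
The split in point 3 can be sketched with pandas label slicing. This is a minimal illustration on synthetic daily data, not the project's own code in main.py; the `split_train_test` helper and the dummy `Close` column are assumptions made for the example.

```python
import numpy as np
import pandas as pd

def split_train_test(df, train=("2018-01-01", "2019-01-01"),
                     test=("2019-02-01", "2020-02-01")):
    """Slice a DatetimeIndex-ed frame into train and test periods.

    Label slicing with date strings is inclusive on both ends.
    """
    train_df = df.loc[train[0]:train[1]]
    test_df = df.loc[test[0]:test[1]]
    return train_df, test_df

# Synthetic daily data standing in for the BTCUSDT candles (assumption)
idx = pd.date_range("2018-01-01", "2020-02-01", freq="D", name="Open Time")
prices = pd.DataFrame({"Close": np.linspace(10_000.0, 9_000.0, len(idx))}, index=idx)

train_df, test_df = split_train_test(prices)
print(train_df.index.min(), train_df.index.max())
print(test_df.index.min(), test_df.index.max())
```

Because the two periods do not overlap, no candle used for optimization is reused when the optimized parameters are evaluated.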

4. Financial Aspects

4.1 Defining trading system and its four criteria

1. Data usage criteria.

Instrument: BTCUSDT, Cryptocurrency.
Time interval: 3, 15, 30 minutes.
Data structure: Open, High, Low, Close, Volume data.

2. Signal generation criteria.

Buy signals are generated when the SRSI 'K' parameter crosses above the 'D' parameter AND the shorter-length EMAs are above the longer-length EMAs, i.e.:

$ema_1 > ema_2 \ \ \& \ \ ema_2 > ema_3$.

Sell signals are generated on a 'D'/'K' cross and when a Take Profit or Stop Loss level is reached.

3. Take Profit/Stop Loss criteria.

For this criterion we use a scalar; the resulting levels still change with every trade, because they depend on the price and the Average True Range as well as the scalar.
Take Profit Criteria:
    price + atr*scalar
Stop Loss Criteria:
    price - atr*scalar

4. Position sizing criteria.

This criterion is also a scalar, a fraction below 10% of total capital, and it is going to be optimized.
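
Criteria 3 and 4 can be illustrated with a short sketch. The function names `tp_sl_levels` and `position_units`, and the numbers passed to them, are hypothetical and chosen only for the example; they are not the optimized values or the code in functions.py.

```python
def tp_sl_levels(price, atr, scalar):
    """Return (take_profit, stop_loss) as price +/- atr * scalar."""
    offset = atr * scalar
    return price + offset, price - offset

def position_units(capital, fraction, price):
    """Units bought when committing `fraction` of `capital` at `price`."""
    if not 0 < fraction < 0.10:
        raise ValueError("fraction must be between 0 and 10% of capital")
    return capital * fraction / price

# Illustrative inputs only (assumptions, not values from the project)
tp, sl = tp_sl_levels(price=100.0, atr=2.0, scalar=1.5)
units = position_units(capital=100_000, fraction=0.05, price=100.0)
print(tp, sl, units)
```

Because the offset scales with the ATR, the distance to both levels widens in volatile periods and narrows in calm ones, even though the scalar itself stays fixed.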

4.2 Visual Validation

In [ ]:
vz.strategy_test_viz(main.test, main.emas, True)

4.3 MAD Results

Below is a comparative table of the results of the three performance attribution metrics analyzed for this trading strategy. They are computed from the evolution of accumulated capital over both the training period and the test period. The metrics are the following:

  • Sharpe Ratio:

    It is calculated by subtracting the risk-free rate from the average of the logarithmic returns, and dividing the result by the standard deviation of those log returns. In general, the higher the value of the Sharpe ratio, the more attractive the risk-adjusted return.

$$\textbf{SR} = \frac{R_p-r_f}{\sigma_p}$$
  • Sortino Ratio:

    It is a variation of the Sharpe ratio that differentiates harmful volatility from total overall volatility by using the standard deviation of negative portfolio returns (the downside deviation) instead of the total standard deviation of portfolio returns. The Sortino ratio takes an asset's or portfolio's return, subtracts the risk-free rate, and then divides that amount by the downside deviation.

$$\textbf{SoR} = \frac{R_p-r_f}{\sigma_d}$$
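
The two formulas above can be sketched with NumPy on an equity curve. This is an illustration under stated assumptions, not the implementation behind `main.MAD`: the function names, the sample standard deviation (`ddof=1`), the per-period risk-free rate `rf` and the toy equity values are all choices made for the example.

```python
import numpy as np

def log_returns(equity):
    """Logarithmic returns between consecutive equity values."""
    equity = np.asarray(equity, dtype=float)
    return np.diff(np.log(equity))

def sharpe_ratio(returns, rf=0.0):
    """(mean return - rf) divided by the standard deviation of returns."""
    returns = np.asarray(returns, dtype=float)
    return (returns.mean() - rf) / returns.std(ddof=1)

def sortino_ratio(returns, rf=0.0):
    """(mean return - rf) divided by the deviation of negative returns."""
    returns = np.asarray(returns, dtype=float)
    downside = returns[returns < 0]  # needs at least two negative returns
    return (returns.mean() - rf) / downside.std(ddof=1)

# Toy equity curve (assumption), not the strategy's results
equity = [100.0, 102.0, 101.0, 104.0, 103.0, 106.0]
r = log_returns(equity)
print(sharpe_ratio(r), sortino_ratio(r))
```

Neither function annualizes its result; comparing values across the 3, 15 and 30 minute intervals would require scaling by the number of periods per year.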
In [ ]:
main.MAD
Out[ ]:
   MAD            Train  Test
0  Sharpe Ratio   0.0    0.0
1  Sortino Ratio  0.0    0.0
2  Calmar Ratio   0.0    0.0
  • Sharpe Ratio:

    If the Sharpe ratio is negative, the performance of the strategy is lower than the risk-free return. Any value of the Sharpe ratio below one means that the return on the assets is less than the risk we are assuming when investing in a given asset.

  • Sortino Ratio:

    The higher the value of the Sortino ratio, the better the rating of the strategy, since it means that trades are operated efficiently and no unnecessary risks, unrewarded by higher returns, are being taken. A low or negative Sortino ratio indicates that the investor is not being rewarded for the risks they are taking.


5. Statistical Aspects

5.1 Make a proposal to the team between 1 and 3 technical studies.

First technical study: Stochastic Relative Strength Index (SRSI)

  • What is SRSI? The stochastic RSI (StochRSI) is a technical indicator used to measure the strength and weakness of the relative strength indicator (RSI) over a set period of time.

  • How to get the SRSI? For this metric we used this reference https://www.investopedia.com/terms/s/stochrsi.asp as well as a technical analysis library, https://technical-analysis-library-in-python.readthedocs.io/en/latest/ta.html, resulting in this function:

In [ ]:
help(fn.stochrsi_k)
Help on function stochrsi_k in module functions:

stochrsi_k(close: pandas.core.series.Series, window: int = 14, smooth1: int = 3, smooth2: int = 3, fillna: bool = False) -> pandas.core.series.Series
    Stochastic Relative Strenght Index K (SRSId)
    The SRSI takes advantage of both momentum indicators in order to create a more 
    sensitive indicator that is attuned to a specific security's historical performance
    rather than a generalized analysis of price change.
    
    Args:
        close(pandas.Series): dataset 'Close' column.
        window(int): n period
        smooth1(int): moving average of Stochastic RSI
        smooth2(int): moving average of %K
    
    Returns:
            pandas.Series: New feature generated.
    
    References:
        [1] https://www.investopedia.com/terms/s/stochrsi.asp

Second technical study: Exponential Moving Average (EMA)

  • What is EMA? It is a technical indicator that shows how the price of an asset changes over a certain period of time. The EMA is different from a simple moving average in that it places more weight on recent data points.

  • How to get the EMA? For this metric we used the following references:

[1] https://stockcharts.com/school/doku.php?id=chart_school:technical_indicators:moving_averages

[2] https://www.investopedia.com/ask/answers/122314/what-exponential-moving-average-ema-formula-and-how-ema-calculated.asp

and the library pandas_ta resulting in this function:

In [ ]:
help(fn.ema)
Help on function ema in module functions:

ema(close, length=None, talib=None, offset=None, **kwargs)
    Exponential Moving Average (EMA)
    The Exponential Moving Average is more responsive moving average compared to the
    Simple Moving Average (SMA).  The weights are determined by alpha which is
    proportional to it's length.  There are several different methods of calculating
    EMA.  One method uses just the standard definition of EMA and another uses the
    SMA to generate the initial value for the rest of the calculation.
    
    Args:
        close (pd.Series): Series of 'close's
        length (int): It's period. Default: 10
        talib (bool): If TA Lib is installed and talib is True, Returns the TA Lib
            version. Default: True
        offset (int): How many periods to offset the result. Default: 0
    
    Returns:
        pd.Series
    
    References:
        [1] https://stockcharts.com/school/doku.php?id=chart_school:technical_indicators:moving_averages
        [2] https://www.investopedia.com/ask/answers/122314/what-exponential-moving-average-ema-formula-and-how-ema-calculated.asp

Third technical study: Average True Range (ATR)

  • What is ATR? The Average True Range is a tool used in technical analysis to measure volatility.

  • How to get the ATR? To get this metric we consulted https://www.tradingview.com/wiki/Average_True_Range_(ATR) and this library: pandas_ta

and the function we got is this:

In [ ]:
help(fn.atr)
Help on function atr in module functions:

atr(high, low, close, length=None, mamode=None, talib=None, drift=None, offset=None, **kwargs)
    Average True Range (ATR)
    Averge True Range is used to measure volatility, especially volatility caused by
    gaps or limit moves.
    
    Args:
        high (pd.Series): Series of 'high's
        low (pd.Series): Series of 'low's
        close (pd.Series): Series of 'close's
        length (int): It's period. Default: 14
        mamode (str): See ```help(ta.ma)```. Default: 'rma'
        talib (bool): If TA Lib is installed and talib is True, Returns the TA Lib
            version. Default: True
        drift (int): The difference period. Default: 1
        offset (int): How many periods to offset the result. Default: 0
    
    Returns:
        pd.Series: New feature generated.
    
    References:
        https://www.tradingview.com/wiki/Average_True_Range_(ATR)

5.2 Algorithm proposal using data for criterion #2

These 3 technical analysis tools were used in order to generate Buy or Sell signals:

  • SRSI

  • EMA

  • ATR

They have to meet the following conditions:

SRSI (K) must be higher than SRSI (D), but only at the bar where K crosses (breaks) above D, AND:

Close > EMA8 > EMA14 > EMA40.

Using the dataframe below, we verify each of the signals and conditions.

In [ ]:
x = main.test[['High', 'Low', 'Close', 'K', 'D', 'EMA_8', 'EMA_14', 'EMA_40', 'ATR', 'KD_Cross', 'TP', 'SL', 'Buy_signal', 'Sell_signal', 'Outcome', 'buys', 'sells']].head()
x
Out[ ]:
High Low Close K D EMA_8 EMA_14 EMA_40 ATR KD_Cross TP SL Buy_signal Sell_signal Outcome buys sells
Open Time
2018-01-02 00:00:00 13539.54 13382.16 13490.42 0.181130 0.075330 13448.718135 13434.082810 13354.042734 124.004615 True 13620.624845 13355.5158 1 0 NaN 13490.42 NaN
2018-01-02 00:15:00 13700.04 13467.22 13700.04 0.492088 0.231864 13504.567439 13469.543768 13370.920649 131.783017 False 13838.412168 13563.0396 0 1 TP NaN 13700.04
2018-01-02 00:30:00 13850.00 13689.71 13715.00 0.825422 0.499547 13551.330230 13502.271266 13387.705008 133.820659 False 13855.511692 13577.8500 0 0 NaN NaN NaN
2018-01-02 00:45:00 13770.00 13672.89 13750.01 1.000000 0.772503 13595.481290 13535.303097 13405.378422 131.196760 False 13887.766598 13612.5099 0 0 NaN NaN NaN
2018-01-02 01:00:00 13800.00 13659.45 13662.13 0.875926 0.900449 13610.292114 13552.213351 13417.902889 131.865253 False 13800.588516 13525.5087 0 0 NaN NaN NaN
In [ ]:
# x is indexed by 'Open Time', so use .iloc for positional access to the first row
kd_cross = x['K'].iloc[0] > x['D'].iloc[0]
print('K/D Cross is:', kd_cross)
K/D Cross is: True
In [ ]:
row = x.iloc[0]  # first row, accessed by position
EMA_cond = (row['Close'] > row['EMA_8']) and (row['EMA_8'] > row['EMA_14']) and (row['EMA_14'] > row['EMA_40'])
print('EMA Condition is:', EMA_cond)
EMA Condition is: True

When both conditions are true, we get a Boolean Buy Signal (True), else we get a False.

In [ ]:
print('Buy signal is:', kd_cross and EMA_cond)
Buy signal is: True
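
The same check can be applied to every row at once. The sketch below is not the project's implementation in functions.py: it reads "cross" as K being above D on the current bar and not on the previous one, which is an assumption (the introduction states the condition over two consecutive bars), and the `buy_signal` name and the toy values are hypothetical. The column names follow the dataframe shown above.

```python
import pandas as pd

def buy_signal(df):
    """1 where K crosses above D and Close > EMA_8 > EMA_14 > EMA_40, else 0."""
    above = df["K"] > df["D"]
    # fill_value=True means the first row can never count as a cross
    cross_up = above & ~above.shift(1, fill_value=True)
    trend = (
        (df["Close"] > df["EMA_8"])
        & (df["EMA_8"] > df["EMA_14"])
        & (df["EMA_14"] > df["EMA_40"])
    )
    return (cross_up & trend).astype(int)

# Toy rows (assumption): K moves above D on the second bar
toy = pd.DataFrame({
    "K": [0.1, 0.3, 0.6],
    "D": [0.2, 0.2, 0.3],
    "Close": [10.0, 11.0, 12.0],
    "EMA_8": [9.0, 10.0, 11.0],
    "EMA_14": [8.0, 9.0, 10.0],
    "EMA_40": [7.0, 8.0, 9.0],
})
signal = buy_signal(toy)
print(signal.tolist())
```

Only the bar where the cross happens is flagged, so a run of bars with K above D does not generate consecutive buy signals.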

Sell signals are a little more complex, because we added some extra steps to avoid false or consecutive signals.

In [ ]:
selldates = []
outcome = []
for i in range(len(x)):
    if x.Buy_signal.iloc[i]:
        # A position opens at row i; read its exit levels
        k = 1
        SL = x.SL.iloc[i]
        TP = x.TP.iloc[i]
        in_position = True
        # Walk forward bar by bar until TP or SL is hit, or the data ends
        while in_position:
            if i + k == len(x):
                break
            looping_high = x.High.iloc[i+k]
            looping_low = x.Low.iloc[i+k]
            if looping_high >= TP:
                # Take Profit is checked before the Stop Loss
                selldates.append(x.iloc[i+k].name)
                outcome.append('TP')
                in_position = False
            elif looping_low <= SL:
                selldates.append(x.iloc[i+k].name)
                outcome.append('SL')
                in_position = False
            k += 1
In [ ]:
# We get 2 lists, selldates and outcome
In [ ]:
print(selldates, outcome)
[Timestamp('2018-01-02 00:15:00')] ['TP']

They contain info about the sell movement, specifically the date and if it was a sell because of a Take Profit or a Stop Loss.

In [ ]:
# Then we localize the date and make a new column named Sell_signal and fill it with 1
# Another column is made and it has the value of the outcome (TP or SL)
# x.loc[selldates, 'Sell_signal'] = 1
# x.loc[selldates, 'Outcome'] = outcome
In [ ]:
# With this strategy we are able to buy or sell when all conditions are met.
x[['Buy_signal', 'Sell_signal', 'Outcome', 'buys', 'sells']].head(2)
Out[ ]:
Buy_signal Sell_signal Outcome buys sells
Open Time
2018-01-02 00:00:00 1 0 NaN 13490.42 NaN
2018-01-02 00:15:00 0 1 TP NaN 13700.04

5.3 Technical studies definitions

1. Stochastic Relative Strength Index

The first step is to calculate the RSI:

$RSI_t(n) = 100 \frac{up_t(n)}{up_t(n) + down_t(n)}$

Where $up_t(n)$ is the average of the positive price changes over the past $n$ timeframes and $down_t(n)$ is the average of the negative price changes (in absolute value) over the past $n$ timeframes.

Once we have the RSI, we can calculate the Stochastic RSI:

$SRSI_t(n) = \frac{RSI_{t} - Min(RSI(n))}{Max(RSI(n)) - Min(RSI(n))}$

Where $RSI_{t}$ = the last RSI in the past n timeframes, $Min(RSI(n))$ = the minimum RSI in the past n timeframes and $Max(RSI(n))$ = the maximum RSI in the past n timeframes
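
The two formulas can be sketched with pandas rolling windows. This is a minimal illustration assuming simple rolling averages for $up_t(n)$ and $down_t(n)$; library implementations often smooth these differently and add the K and D moving averages, so the numbers need not match `fn.stochrsi_k`. The function names and the sample prices are hypothetical.

```python
import numpy as np
import pandas as pd

def rsi(close, n=14):
    """RSI from simple rolling averages of gains and losses."""
    delta = close.diff()
    up = delta.clip(lower=0).rolling(n).mean()       # average gain
    down = (-delta.clip(upper=0)).rolling(n).mean()  # average loss, as a positive number
    return 100 * up / (up + down)

def stoch_rsi(close, n=14):
    """Position of the current RSI within its min-max range over n periods."""
    r = rsi(close, n)
    lowest = r.rolling(n).min()
    highest = r.rolling(n).max()
    return (r - lowest) / (highest - lowest)

# Sample prices (assumption) with alternating gains and losses
close = pd.Series([10, 11, 10.5, 12, 11.5, 13, 12, 14,
                   13.5, 15, 14, 16, 15.5, 17, 16, 18], dtype=float)
srsi = stoch_rsi(close, n=3).dropna()
print(srsi.round(3).tolist())
```

The result is bounded between 0 and 1; it is undefined (NaN) whenever the RSI is constant over the window, because the denominator is then zero.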

2. Exponential Moving Average

In this strategy we use three EMAs, and this is the process to calculate the EMA with an $n$-period lag at time $t$:

$ema_t(P,n) = \beta P_t + \beta (1-\beta)P_{t-1} + \beta (1-\beta)^{2}P_{t-2} + ...$

$=\beta P_t + (1-\beta) ema_{t-1}(P,n)$

Where the smoothing coefficient $\beta$ is usually: $\beta = \frac{2}{n+1}$
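
The recursive form can be checked against pandas. The sketch below assumes the recursion is seeded with the first price, which is one of several common conventions; `ema_recursive` and the sample prices are hypothetical, and the project's `fn.ema` may initialize differently.

```python
import numpy as np
import pandas as pd

def ema_recursive(prices, n):
    """EMA via ema_t = beta * P_t + (1 - beta) * ema_{t-1}, beta = 2 / (n + 1)."""
    beta = 2.0 / (n + 1)
    out = np.empty(len(prices))
    out[0] = prices[0]  # seed with the first price (assumption)
    for t in range(1, len(prices)):
        out[t] = beta * prices[t] + (1 - beta) * out[t - 1]
    return out

prices = np.array([10.0, 11.0, 12.0, 11.0, 13.0])
manual = ema_recursive(prices, n=3)
# span=n gives alpha = 2 / (n + 1); adjust=False selects the recursive form
with_pandas = pd.Series(prices).ewm(span=3, adjust=False).mean().to_numpy()
print(manual, with_pandas)
```

With `adjust=False`, pandas applies the same recursion and the same seed, so both arrays agree.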

3. Average True Range

The first step in calculating the ATR is to get the True Range, which follows this formula:

$TR = max[(H - L), |H - C_p|, |L-C_p|]$

After this, we can take the average:

$ATR = \frac{1}{n} \sum_{i=1}^{n} TR_i$

Where:

  • $TR_i$ is the True Range of period $i$
  • $n$ is the period of time employed
  • $H$ is the current High
  • $L$ is the current Low
  • $C_p$ is the previous Close
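
Both steps can be sketched with pandas. The example uses the simple mean from the formula above; the `fn.atr` help output lists `'rma'` as its default `mamode`, which is a different smoothing, so values from that function need not match. The function names and the three sample candles are hypothetical.

```python
import pandas as pd

def true_range(high, low, close):
    """TR = max(H - L, |H - C_p|, |L - C_p|), with C_p the previous close."""
    prev_close = close.shift(1)
    candidates = pd.concat(
        [high - low, (high - prev_close).abs(), (low - prev_close).abs()],
        axis=1,
    )
    # max skips NaN, so the first bar (no previous close) falls back to H - L
    return candidates.max(axis=1)

def atr(high, low, close, n=14):
    """Simple average of the last n True Range values."""
    return true_range(high, low, close).rolling(n).mean()

# Three sample candles (assumption)
high = pd.Series([10.0, 12.0, 13.0])
low = pd.Series([8.0, 9.0, 11.0])
close = pd.Series([9.0, 11.0, 12.0])
tr = true_range(high, low, close)
atr2 = atr(high, low, close, n=2)
print(tr.tolist(), atr2.tolist())
```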

6. Computational Aspects


Objective Function

The MAD we are looking to maximize is the Return (%). We did this through the Backtesting library.

In order to maximize the return, we also define some parameters to optimize:

  • First Parameter:

    Name: SRSI window

    Description: Indicates the length of the time window the SRSI considers.

    Value Type: numeric, int.

    Value Range: [8,9,10,11]

    Minimum Step Size: 1

  • Second Parameter:

    Name: SRSI K

    Description: Stochastic RSI K parameter, moving average.

    Value Type: numeric, int.

    Value Range: [2, 3, 4]

    Minimum Step Size: 1

  • Third Parameter:

    Name: SRSI D

    Description: Stochastic RSI D parameter, moving average.

    Value Type: numeric, int.

    Value Range: [2, 3, 4]

    Minimum Step Size: 1

  • Fourth Parameter:

    Name: EMA1 length

    Description: The period to calculate.

    Value Type: numeric, int.

    Value Range: [6,7,8,9]

    Minimum Step Size: 1

  • Fifth Parameter:

    Name: EMA2 length

    Description: The period to calculate.

    Value Type: numeric, int.

    Value Range: [11,12,13,14]

    Minimum Step Size: 1

  • Sixth Parameter:

    Name: EMA3 length

    Description: The period to calculate.

    Value Type: numeric, int.

    Value Range: [35,37,38,44]

    Minimum Step Size: 1

Screenshot of Params:

*Due to computational constraints, we could not simply call the optimal parameters here, so we used screenshots.

In [ ]:
Image(filename = 'files/params.png')
Out[ ]:
In [ ]:
Image(filename = 'files/params2.png')
Out[ ]:
  • Search Space:

    $\text{number of parameters (m): 4}$

    $\text{Possible values (n): 4}$

    $\text{Search space } (n^m) = 4^4 = 256$
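
The size of a grid search can be counted directly from the value ranges. The sketch below enumerates the six ranges exactly as listed above; the dictionary keys are hypothetical labels, not argument names from the project's code. Taking all six ranges gives 4 · 3 · 3 · 4 · 4 · 4 = 2304 combinations, whereas four parameters with four values each give the 256 quoted above.

```python
from itertools import product

# Value ranges as listed in this section; key names are illustrative only
ranges = {
    "srsi_window": [8, 9, 10, 11],
    "srsi_k": [2, 3, 4],
    "srsi_d": [2, 3, 4],
    "ema1": [6, 7, 8, 9],
    "ema2": [11, 12, 13, 14],
    "ema3": [35, 37, 38, 44],
}

grid = list(product(*ranges.values()))
n_all = len(grid)        # every combination of the six ranges
n_four_by_four = 4 ** 4  # four parameters with four values each

# The EMA ranges do not overlap, so len(ema1) < len(ema2) < len(ema3) always holds
ordered = all(e1 < e2 < e3 for (_, _, _, e1, e2, e3) in grid)
print(n_all, n_four_by_four, ordered)
```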

We obtained:

In [ ]:
Image(filename = 'files/opt_params.png')
Out[ ]:

6.1 Capital Evolution for train

In [ ]:
vz.Equity_viz(main.output_train._equity_curve['Equity'].index, main.output_train._equity_curve['Equity'], True)

6.2 Capital Evolution for test

In [ ]:
vz.Equity_viz(main.output_test._equity_curve['Equity'].index, main.output_test._equity_curve['Equity'],  True)

8. Conclusions

Even though we did not apply the strategy from a microstructure point of view, we did apply many things from the course, starting with the organization, the way of presenting the project, and the dynamic of calculating and paying attention to small details.

We first struggled to decide which strategy to put into practice, so we spent considerable time discussing and sharing candidate strategies within the team. We finally found this one and chose to work on it because the metrics made sense to us, and here is why:

The SRSI is an indicator that concentrates on market momentum and, according to what we read, succeeds at providing readings of overbought and oversold market conditions, while EMAs are the most popular metrics for identifying trends. Combining both with the ATR, a volatility indicator, to calculate an appropriate TP may lead to a useful strategy.

After the backtesting and the optimization we found that the strategy is not ready to be deployed, because the equity return [%] was below our expectations; to get a profitable strategy we must try many more metrics.

Regardless of the results, we are satisfied with the work we did, and we are glad to say that this project has encouraged us to keep learning about this kind of topic.

9. References


[1] Pandas technical analysis library https://technical-analysis-library-in-python.readthedocs.io/en/latest/ta.html

[2] Plotly Documentation https://plotly.com/

[3] Average True Range https://www.tradingview.com/wiki/Average_True_Range_(ATR)

[4] Stochastic Relative Strength Index https://www.investopedia.com/terms/s/stochrsi.asp

[5] Exponential Moving Average https://www.investopedia.com/ask/answers/122314/what-exponential-moving-average-ema-formula-and-how-ema-calculated.asp